A General Scheme of a Branch-and-Bound Approach for the Sensor Selection Problem in Near-Field Broadband Beamforming

This paper is devoted to the sensor selection problem. A broadband receive beamformer operating in the near field is considered. The system response should be as close as possible to the desired one in the sense of the L2 norm. The problem considered is at least NP-hard; therefore, a branch-and-bound algorithm is developed to solve it. The proposed approach is universal and can be applied not only to microphone arrays but also to antenna arrays; that is, the methodology for the generation of consecutive solutions can be applied to different types of sensor selection problems. Next, for larger microphone arrays, an efficient metaheuristic algorithm is constructed. The algorithm implemented is a hybrid genetic algorithm based on the ITÖ process. Numerical experiments show that the proposed approach can be successfully applied to the sensor selection problem.


Introduction
Beamforming is a technique widely used in acoustic signal processing to improve microphone directionality. This technique involves combining many microphones into an array system distributed in space according to specific patterns [1]. Such arrays, called microphone arrays, are particularly effective when there are multiple sound sources and the desired signal must be separated from unwanted noise. Additionally, a microphone array can be used to estimate the direction of arrival (DOA) of the wave. Importantly, signals can be separated even when they occupy the same frequency band but arrive at the array from different points in space. This approach is called spatial filtering. The system acts as a spatial filter that consolidates the acoustic signals received by individual microphones, i.e., it creates a beam; therefore, the process is called beamforming. For wide-band signals (such as speech), a wide-band beamformer is equivalent to applying finite impulse response (FIR) filters to each microphone output and then summing these signals. The coefficients of such a system are selected so that the system's performance meets a given criterion, e.g., maximizing the signal or minimizing or eliminating the noise. However, the efficiency of the system depends not only on the filter coefficients and their length but also on the array configuration. Thus, a key step in beamforming is the selection of the appropriate spatial configuration of the microphones to optimize the performance of the beamforming system [2].
The selection problem involves selecting the most appropriate set of microphones from those that make up the array. The selected set can effectively estimate the desired signal while minimizing interference from undesirable sources [3][4][5]. The selection process can significantly affect the accuracy and quality of the beamforming results. Therefore, it is important to develop robust and efficient sensor selection methods to achieve optimal beamforming performance.
In general, several approaches have been proposed in the literature for sensor selection in near-field beamforming. A common method is based on a spatial coherence array, which measures the spatial correlation between sensors. The most suitable subset of beamforming sensors with the desired properties can be identified by analyzing the characteristics of the coherence array [6][7][8].
The classical approach to the spatial filtering problem utilizes one of the standard microphone layouts (e.g., horizontal, vertical, spherical, or equally spaced in a rectangular array). It is a convenient approach since a fixed microphone array, e.g., a set of microphones mounted on a tripod, can be used. To improve the efficiency of such a system, the location of each microphone can be manipulated in one or two directions. However, this solution needs additional mechanics that allow for precise movements of particular microphones or groups of them. Another solution is based on the natural assumption that the more microphones are utilized, the better the beamforming results. Thus, a microphone array would be spanned to form a larger geometrical structure. However, increasing the number of microphones is not always an optimal solution. It not only increases the cost of the array and its power consumption but also does not necessarily improve the desired criterion value. In a large microphone array, only a selected set of microphones is therefore utilized, while the others are inactive and do not participate in the beamforming. This allows one to use the same large array in different scenarios; each time, a different set of microphones gives the optimal beamforming quality. This technique is called array thinning and has been successfully implemented in antenna arrays [9][10][11][12][13][14][15]. However, there are only a few papers devoted to thinning microphone arrays [16,17].
Gao et al. [18] considered the problem of sparse beamformer design under the L1 norm and, based on the properties of this norm, were able to reduce the size of the microphone array. In [16], a thinning of a microphone array based on the Taguchi method was proposed. The authors studied the effectiveness of the Taguchi method in determining the microphone configuration.
Another approach to sensor selection in near-field beamforming involves the use of optimization algorithms such as genetic algorithms [19], particle swarm optimization [20][21][22], or simulated annealing [17,23,24]. These methods aim to find the optimal sensor combination that maximizes a beamforming performance metric, such as the signal-to-interference-plus-noise ratio (SINR) or beamforming gain. In recent years, machine learning techniques have also gained popularity in the selection of near-field beamforming sensors. These methods take advantage of the power of data-driven algorithms to learn patterns and relationships from training data, enabling the identification of sensors that provide the most beamforming information [25].
The problem of choosing the optimal set of microphones considered in the paper is at least NP-hard and cannot be solved optimally in polynomial time (unless P = NP). To solve this problem, metaheuristic algorithms were implemented for the antenna array [9,10] and for the microphone array [17,19], and, since this is a discrete optimization problem, an efficient branch-and-bound algorithm was developed for the signal-to-interference-plus-noise ratio (SINR) optimization criterion [26]. However, exact methods are rarely applied to sensor selection problems, and this approach has not been studied sufficiently. In many cases, the problem is relaxed to a convex optimization problem and solved with available gradient methods or greedy algorithms [3,4].
Therefore, in this paper, a branch-and-bound algorithm, which is an exact method, is proposed to solve the microphone array thinning problem. The presented approach is universal and can be applied not only to microphone arrays but also to antenna arrays; that is, the methodology for the generation of consecutive solutions can be applied to different types of sensor selection problems. An algorithm that generates a solution tree, which is then searched using a depth-first search, is provided. The solution is built from scratch, and, at each step, one more microphone is turned on. The child node always has one more active microphone than its parent. The presented approach to the problem considered is new and has not yet been considered in the scientific literature.
The paper is organized as follows. Section 2 describes the problem in detail. The proposed solution is presented in Section 3, while, in Section 4, an experimental analysis of the proposed approach is carried out and the results of the algorithm developed are presented. Next, in Section 5, we discuss the possibility of using the provided coding scheme in heuristic methods, called greedy algorithms. The work ends with a short summary and future research ideas in Section 6.

Problem Formulation
In this paper, we focus on the sensor selection problem for near-field broadband beamforming, in which the system response should be as close as possible to the desired one. Formally, the problem considered can be defined as follows. A microphone array consisting of N microphones is given. The geometric center of the array is at the point denoted $r_c$. The microphones do not have to be equally spaced; the array can be of any shape, and there are no restrictions on microphone positions, e.g., microphones can be placed in a rectangular or spherical shape. The signals from the microphones are sampled synchronously, and the digital signals are then directed to the inputs of FIR filters of length L (each array element consists of a microphone followed by an L-tap FIR filter). At a given time, any number K ∈ {1, 2, . . ., N} of microphones can be active, while the others remain inactive.
The solution can be defined by the position vector of the microphones, which follows directly from the set of active microphones λ = (λ(1), λ(2), . . ., λ(i), . . ., λ(K)). The microphone λ(i), that is, the i-th active microphone, has position $r_i$. Let Λ denote the set of all possible combinations of active and inactive microphones. There are $2^N - 1$ possible solutions because the solution with all microphones inactive is excluded by definition.
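To make the size of the solution space concrete, the following minimal sketch enumerates all activation patterns for a small array and skips the excluded all-inactive pattern. The function name and the 0/1 tuple encoding are illustrative choices, not notation taken from this paper.

```python
from itertools import product

def nonempty_activation_patterns(n_mics):
    """Yield every 0/1 activation vector of length n_mics except all zeros."""
    for bits in product((0, 1), repeat=n_mics):
        if any(bits):          # skip the pattern with every microphone inactive
            yield bits

# For N = 4 microphones there are 2**4 - 1 candidate solutions.
patterns = list(nonempty_activation_patterns(4))
print(len(patterns))
```

Because the count doubles with every added microphone, exhaustive enumeration of this kind is only practical for small N.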
The transfer function of microphone λ(i) in the near field is a function of where c is the speed of sound in the air and r is the location of the sound source.The frequency response of the microphone λ(i) filter is given by where denotes the coefficients of the i-th FIR filter of length L and For the given set of active microphones λ, a system response can be found by solving the following equation: where ] is a vector containing transfer functions of all microphones for set λ and H is the frequency filter response vector for set λ.
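The quantities named in this paragraph can be illustrated with a model that is commonly used for near-field filter-and-sum beamformers. The expressions below are an assumption made for illustration only: they are chosen to match the symbols used in this section (c, r, $r_i$, L, and the sampling frequency $f_s$), and the exact normalization intended here may differ.

```latex
\begin{align}
A_{\lambda(i)}(r, f) &= \frac{1}{\lVert r - r_i \rVert}\,
    e^{-j 2\pi f \lVert r - r_i \rVert / c}, \\
H_{\lambda(i)}(h, f) &= h_i^{T} d_0(f), \qquad
    h_i = [h_i(0), \dots, h_i(L-1)]^{T}, \quad
    d_0(f) = [1, e^{-j 2\pi f / f_s}, \dots, e^{-j 2\pi f (L-1) / f_s}]^{T}, \\
G_{\lambda}(r, h, f) &= H_{\lambda}^{T}(h, f)\, A_{\lambda}(r, f).
\end{align}
```

Under this assumed model, each active microphone contributes a spherical-spreading attenuation and a propagation delay, and the system response is the sum of the filtered microphone signals.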
The desired response of the system $G_d(r, f)$ is also defined, where r is the location of the sound source and f is the frequency. The problem addressed in this paper is to design the microphone array (i.e., to determine the set of active microphones: their number, positions, and FIR filter coefficients) so that the actual output of the beamformer is as close as possible to the desired one in the sense of the $l_2$ norm.
Taking the system response $G_\lambda(r, h, f)$ allows one to calculate a vector of filter coefficients $h_\lambda$ that minimizes the objective function: where σ(r, f) is a positive weighting function, while Ω defines the spatial-frequency domain. Usually, Ω consists of the passband region $\Omega_P$ and the stopband region $\Omega_S$, that is, $\Omega = \Omega_P \cup \Omega_S$. Therefore, the problem of finding the optimal set of coefficients (frequency response) can be determined by solving the quadratic problem, which can be solved very quickly using quadratic programming techniques [27]: where and Γ = u(f) + jv(f), where u(f), v(f) are continuous and integrable, and there exist left and right derivatives such that v(0) = 0 and v($f_s$/2) = 0.
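As a rough illustration of how the criterion could be evaluated for one set of active microphones, the sketch below computes a discretized weighted squared error between the system response and the desired response. It is not the formulation of this paper: the near-field transfer function (spherical spreading plus delay), the value of the speed of sound, and all function names are assumptions, and the integral over Ω is replaced by a plain sum over grid points.

```python
import cmath
import math

C_SOUND = 343.0  # speed of sound in air in m/s (assumed value)

def mic_transfer(mic, src, f):
    """Assumed near-field model: 1/distance attenuation and propagation delay."""
    dist = math.dist(mic, src)
    return cmath.exp(-2j * math.pi * f * dist / C_SOUND) / dist

def fir_response(taps, f, fs):
    """Frequency response of a real FIR filter with coefficients `taps`."""
    return sum(h * cmath.exp(-2j * math.pi * f * k / fs)
               for k, h in enumerate(taps))

def system_response(mics, filters, src, f, fs):
    """Filter-and-sum response over the active microphones."""
    return sum(fir_response(h, f, fs) * mic_transfer(m, src, f)
               for m, h in zip(mics, filters))

def criterion(mics, filters, grid, fs):
    """Weighted squared error; grid holds (src, f, desired, weight) tuples."""
    return sum(w * abs(system_response(mics, filters, src, f, fs) - gd) ** 2
               for src, f, gd, w in grid)

# One microphone at the origin with a single unit tap, source 1 m away.
mics = [(0.0, 0.0, 0.0)]
filters = [[1.0]]
g = system_response(mics, filters, (1.0, 0.0, 0.0), 1000.0, 8000.0)
e_stop = criterion(mics, filters, [((1.0, 0.0, 0.0), 1000.0, 0j, 2.0)], 8000.0)
```

For a fixed set of microphones, this error is quadratic in the filter taps, which is why the coefficient design reduces to a quadratic problem.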
Based on [27], one can assume that there is a performance limit for finite filter length designs, and further increasing the length of the filter does not significantly improve the criterion value. For a filter of sufficient, fixed length, one can write the beamformer design problem as $\min_{\lambda \in \Lambda(M),\, h \in \mathbb{R}^{K \times L}} E_\lambda(r, h)$.

Algorithms
Since the problem considered is at least NP-hard, no exact algorithm with polynomial complexity exists (unless P = NP). Therefore, this paper focuses on an exact algorithm based on the branch-and-bound (B&B) technique. B&B is an algorithmic paradigm that must be completed for each specific type of problem, and there exist numerous choices for each of its components [28]. As B&B can only be applied to small instances, a metaheuristic hybrid algorithm is proposed to find a satisfactory solution for larger microphone arrays.

Branch-and-Bound
This section is devoted to the B&B algorithm for the general problem of choosing the set of active/inactive devices. The general scheme of this type of algorithm can be described as follows. The set of candidate solutions is kept as a rooted tree. The branches of this tree contain subsets of the solution set (each node is a partial solution and part of the solution set). Starting with the root, the algorithm explores the branches using a depth-first or breadth-first search of this tree.
Before the algorithm starts, a way to calculate an upper bound (UB) should be provided. An upper bound is usually the criterion value of a particular solution. Initially, it is typically calculated in advance using a heuristic, and this value is used as the current best-known solution. If no heuristic method is applied, UB can be set to infinity for a minimization problem. Next, for each node visited, a lower bound (LB) is calculated. The lower bound is usually the criterion value of a relaxed problem, which ensures that, for all feasible solutions, the modified function has values less than or equal to the original function. It is used to determine whether it is worth continuing deeper into the analyzed branch or whether this branch can be excluded from further examination. If LB is greater than the given upper bound, then the examined subset is removed; the branch is pruned. Therefore, it is possible to reduce the number of solutions tested. The efficiency (number of nodes visited) of the B&B scheme strictly depends on the values of UB and LB.
The algorithm starts with an initial solution, calculates its criterion value $E_0(r, h)$, and sets UB = $E_0(r, h)$. The solution is then built from scratch. Initially, a microphone is added to the solution and the lower bound for this branch is calculated. If the lower bound is smaller than the upper bound, B&B progresses deeper, that is, one more microphone is added to the solution and the lower bound is calculated and compared with the upper bound value. If the lower bound is greater than or equal to the upper bound, this branch is pruned. The general scheme of B&B is shown in Figure 1 and is described in Algorithm 1.
Since an amplitude criterion is considered, one can assume that the best improvement from adding one microphone is 6 dB. This estimate is based on the best possible case. Consider an array consisting of two microphones. If a narrowband signal reaches the array (for simplicity, let us assume a sinusoidal signal), then the signal in one of the channels should be delayed to equalize the phases of both signals, and the two recorded signals are then summed. Since the noise in both channels is not correlated, as was assumed at the beginning of the considerations, the amplitude of the sinusoidal signal is doubled, while the noise adds incoherently and grows more slowly. A doubling of the amplitude corresponds to 20 log10(2) ≈ 6 dB, which is taken as an optimistic estimate of the improvement in this situation. Of course, this happens only in the most favorable situation because, in reality, we are dealing with wide-band signals.
The search tree is a binary tree of depth M (levels m = 1, . . ., M), and each node at level m has m active microphones. We explore the solution tree using a depth-first method, starting with the node at which all devices are inactive. We denote this level of the tree as m = 0. Next, we continue deeper into the tree. Let us denote the solution represented by node n at level m as $\lambda_n$ and its criterion value as $E_{\lambda_n}(r, h)$. This solution can be extended by selecting more microphones; however, the number of microphones available is bounded by the number of levels below the node n (let us denote this number by $n_m$). Therefore, the LB for the given node branch is calculated according to Equation (9). For each node, its criterion value $E_{\lambda_n}(r, h)$ is compared with UB. If it is smaller than UB, then UB = $E_{\lambda_n}(r, h)$.
It is difficult to extract the exact impact of a single sensor on the criterion value and thus to estimate LB more efficiently. An example of pruning is shown in Figure 2.
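The pruning scheme described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the lower bound is taken as the node's criterion value minus 6 dB for every microphone that can still be added (an assumed form of the bound), the criterion is assumed to be expressed in dB, children are generated by adding one microphone with a higher index than any already active, and the function and parameter names are hypothetical.

```python
import math

def branch_and_bound(n_mics, criterion, ub=math.inf, gain_db=6.0):
    """Depth-first search over microphone subsets with an optimistic bound.

    criterion(subset) returns the value to minimize (assumed to be in dB);
    ub is the initial upper bound, e.g. from a heuristic.
    """
    best_subset, visited = None, 0
    stack = [(i,) for i in reversed(range(n_mics))]  # level 1: one active mic
    while stack:
        subset = stack.pop()
        visited += 1
        value = criterion(subset)
        if value < ub:                       # better solution found
            ub, best_subset = value, subset
        remaining = n_mics - 1 - subset[-1]  # microphones that may still be added
        lb = value - gain_db * remaining     # assumed form of the lower bound
        if lb >= ub:
            continue                         # prune the whole branch
        for nxt in reversed(range(subset[-1] + 1, n_mics)):
            stack.append(subset + (nxt,))    # child has one more active mic
    return best_subset, ub, visited

# Toy criterion: each microphone changes the value by a fixed amount (<= 6 dB).
gains = [5.0, -2.0, 4.0, -1.0, 3.0]
toy = lambda s: 10.0 - sum(gains[i] for i in s)
best, value, visited = branch_and_bound(len(gains), toy)
```

In the toy example the bound is valid because no single microphone improves the criterion by more than 6 dB; with a real criterion, the search is exact only to the extent that this assumption holds.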

Heuristic Algorithms
In addition to the B&B scheme, two heuristic algorithms and one hybrid metaheuristic algorithm were proposed. The first one is based on the B&B scheme. The branch is pruned if adding (turning on) one additional microphone does not improve the criterion value. The provided solution is a local optimum. The general scheme of the algorithm is presented below (see Algorithm 2).

Algorithm 2: HA1
Set crit = E($\lambda_{best}$), the criterion value calculated for the best solution with one active microphone.
Start depth-first search of the solution tree.
While the set of solutions to check is not empty:
    For node i, calculate the current criterion value $E_0(r, h)$

In the second proposed algorithm, the solution is built from scratch. At each step, the microphone that provides the best improvement of the criterion value is chosen and added to the solution. The algorithm stops if selecting an additional microphone does not improve the criterion value (greedy approach); see Algorithm 3.
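The greedy construction can be sketched in a few lines. The code below is an illustration only: the criterion is passed in as a callable to be minimized, and the function name and the tuple encoding of active microphones are hypothetical.

```python
def greedy_selection(n_mics, criterion):
    """Forward selection: repeatedly add the microphone that helps most."""
    active = []
    best = float("inf")
    while len(active) < n_mics:
        candidates = [m for m in range(n_mics) if m not in active]
        # Evaluate every one-microphone extension of the current solution.
        value, mic = min((criterion(tuple(sorted(active + [m]))), m)
                         for m in candidates)
        if value >= best:
            break              # no single addition improves the criterion
        best = value
        active.append(mic)
    return tuple(sorted(active)), best

# Toy criterion in which each microphone changes the value by a fixed amount.
gains = [5.0, -2.0, 4.0, -1.0, 3.0]
toy = lambda s: 10.0 - sum(gains[i] for i in s)
subset, value = greedy_selection(len(gains), toy)
```

Each step evaluates at most N candidate extensions, so the number of criterion evaluations grows quadratically with N rather than exponentially; the price is that the result is only a local optimum in general.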
Next, an algorithm proposed by Dong et al. [29] was adapted and implemented for our problem. It is a hybrid genetic algorithm that contains methods known as cuckoo search and simulated annealing. This algorithm is called NGA in the later part of this paper. As in genetic algorithms, a set of chromosomes is given, which can be called particles in some hybrid methods; these chromosomes are subjected to a crossover operation. Each chromosome represents an individual solution. The crossover operator is based on the ITÖ process, and its length depends on the particle radius, the environment temperature, and the activity intensity. After the crossover is performed, new individuals (particles) are formed and the set of new solutions is checked (Algorithm 3).

Algorithm 3: HA2
Set crit = E($\lambda_{best}$), the criterion value calculated for the best solution with one active microphone.
Add the microphone that provides the best improvement of the criterion value; set crit = E($\lambda_{best}$) for this solution.
If adding any microphone does not improve the criterion value, show the best solution $\lambda_{best}$.

At the beginning, a set of X random particles is given. Next, these particles are sorted in best-to-worst order and are denoted by $x_i$, i ∈ {1, . . ., X}, where X is the number of particles. In the next step, the radius of each particle $x_i$ is calculated as follows: where $rad_{max}$ and $rad_{min}$ are the maximum and minimum particle radius, respectively, and all particle radii are uniformly distributed between $rad_{min}$ and $rad_{max}$.
As in the simulated annealing algorithm, the environment temperature is gradually reduced, and this process is defined by $T_{it} = \gamma T_{it-1}$, where it is the iteration number and γ < 1 is a cooling coefficient.
Based on the radius of each particle, the activity intensity of each particle $x_i$ is calculated. This parameter controls the intensity of the movement of the particles: where $rad_i$ is the radius of particle $x_i$ and T is the current temperature.
To perform a crossover, a crossover length $L_i$ is calculated for each particle $x_i$, and this value is controlled by the activity intensity as follows: where β is a random number drawn from the uniform distribution on [0; 1]. The starting position $s_i$ of the crossover operator is randomly selected, and $L_i$ consecutive positions in the particle $x_i$ are crossed with the best solution. Each particle encodes a solution and is represented by a vector of real numbers uniformly distributed in [0; 1]; the length of the vector is equal to the size of the array, that is, the number of microphones available. During decoding, the value of each element of the vector is rounded. If the rounded value is equal to 1, the corresponding microphone is active; otherwise, it is inactive, that is, it is not chosen. Since the activity of each microphone is coded as a real number, the following crossover operator is used: where j = $s_i$, $s_i$ + 1, . . ., $s_i + L_i$, α ∈ [0; 1] is a crossover coefficient, and best_one(j) is the current best solution (Algorithm 4).
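Three of the building blocks described above (decoding by rounding, geometric cooling, and crossover with the best solution) can be sketched as follows. The decoding and cooling follow the text directly; the crossover is written as an arithmetic blend weighted by α, which is an assumption about the form of the operator, as are the function names, the 0.5 rounding threshold, and the truncation of the crossed segment at the end of the vector.

```python
def decode(particle):
    """Round each gene; 1 marks an active microphone, 0 an inactive one."""
    return [1 if gene >= 0.5 else 0 for gene in particle]

def cool(temperature, gamma=0.9):
    """Geometric cooling: T_it = gamma * T_(it-1), with gamma < 1."""
    return gamma * temperature

def crossover(particle, best, start, length, alpha=0.1):
    """Blend a run of genes with the best solution (assumed arithmetic form)."""
    child = list(particle)
    for j in range(start, min(start + length, len(child))):
        # alpha keeps part of the particle; the rest comes from the best one.
        child[j] = alpha * child[j] + (1.0 - alpha) * best[j]
    return child

particle = [0.0, 0.0, 0.0, 0.0]
best_one = [1.0, 1.0, 1.0, 1.0]
child = crossover(particle, best_one, start=1, length=2, alpha=0.1)
mask = decode(child)
```

With a small α, the crossed genes move most of the way toward the best solution, so after rounding the corresponding microphones take the state they have in the best solution.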
For Ω 3 , the passband region is defined as follows: and the stopband is The desired responses for Ω 1 , Ω 2 , Ω 3 are presented in Figures 3-5, respectively.The placement feasible region is for an array of maximum size 4 × 4 microphones and for arrays of size 5 × 5 and 6 × 6. Obviously, the space of possible microphone locations cannot contain the location of the sound source.The numerical experiment starts with the microphone array that is, at the beginning, equispaced and fills the entire region.Both the passband and stopband are discretized, the frequency points are taken every 0.1 kHz, and the spatial points are taken every 0.1 m.
It should be noted that evaluating even a single array configuration is time-consuming. It takes ca. 0.18 s for an array of four microphones, and the time increases with the number of microphones.

Branch-and-Bound
In this section, the results of numerical experiments for the B&B algorithm are presented. As an upper bound, the criterion value of the solution provided by simulated annealing (SA) is taken [23]. This easy-to-implement local search metaheuristic is fast and efficient for many discrete optimization problems. Its main steps are presented below (Algorithm 5). The number of iterations is denoted by $max_{it}$; T and γ are the initial temperature and the cooling parameter, respectively. In the implemented version of this algorithm, the number of iterations strictly depends on the size of the problem. Due to the long running time, it is limited to half the size of the solution space: $0.5 \times 2^N$, where N denotes the number of microphones in the array. For a 12-microphone array, the number of SA iterations was reduced to 500.
Once the UB is calculated, the B&B algorithm is run. We ran the algorithm 20 times. The results of B&B are collected in Table 1. In the second column, the microphone configuration is shown (rows and columns). In the third column, the mean and minimum (in brackets) criterion values of UB are collected. These values were provided by the simulated annealing algorithm. Next, in the fourth column, the optimal criterion value is presented for the given configuration. The fifth column contains the mean and minimum (in brackets) number of nodes that were visited, that is, solutions that were checked. The mean running time of the B&B algorithm is presented in the last column. An example of the optimal solution (set of active microphones) and its response are presented in Figures 6 and 7, respectively.

Heuristic Algorithms
The results presented in Table 1 show that the proposed B&B algorithm reduces the calculation time and the number of nodes visited; however, its running time is long and increases exponentially. Therefore, two greedy algorithms and a metaheuristic approach based on the hybrid genetic method proposed in [29] were examined. The parameters of the algorithm were chosen experimentally. The number of particles is equal to the number of microphones. The initial parameters of NGA are as follows: the initial environment temperature is 1000, the total number of iterations (stopping condition) is the number of microphones multiplied by 3, $rad_{max}$ = 1, $rad_{min}$ = 0, cooling coefficient γ = 0.9, and crossover coefficient α = 0.1. The results of the proposed metaheuristic algorithm are collected in Table 2. For each instance, the algorithm was executed 20 times. The second column contains the microphone configuration (rows and columns). In the third column, the mean and minimum (in brackets) criterion values are gathered. The mean running time of the NGA algorithm is presented in the last column. The results of algorithms HA1 and HA2 are gathered in columns 5-6 and 7-8, respectively.
The proposed metaheuristic approach can find an optimal solution for smaller instances (for arrays of four and six microphones, it found the optimal value of the criterion almost every time). For a greater number of microphones, the metaheuristic approach is able to provide satisfactory criterion values. However, to achieve a good solution, the running time of the algorithm must be long, and the number of iterations should depend on the size of the array.
The HA1 method is efficient only for small instances. Its running time is short, and its results are close to optimal. However, for bigger instances (5 × 5 and 6 × 6), the running time of the algorithm was unacceptable, and it was stopped after 10 min. The criterion values provided by HA2 are comparable with those of HA1 and NGA. Its running time is the shortest. In the case of instances with a greater number of sensors, the greedy approach is the most efficient. It can provide a good solution in a short time. For smaller instances, NGA was the best one, and its solutions were close to, or even equal to, the global optimum.
Future work should focus on answering the following questions. Is it possible to provide a better estimate of LB? Is it possible to extract the impact of a specific, single microphone on the value of the criterion? Is it possible to provide elimination procedures that significantly reduce the number of visited nodes?

Conclusions
In this paper, a general branch-and-bound algorithm was proposed for the sensor selection problem. During the depth-first search, successive solutions in one branch differ in only one bit, which is equivalent to the activation/deactivation of a microphone. The proposed general pruning scheme decreases the number of nodes visited (checked solutions). Then, an efficient metaheuristic approach was proposed and examined in numerical experiments. Future work will focus on providing efficient elimination procedures for the branch-and-bound algorithm.

Figure 1. Example of a solution tree for 4 microphones; 1 denotes an active microphone, while 0 denotes an inactive microphone.

Algorithm 1: Branch-and-Bound
Set UB = E($\lambda_{best}$) provided by simulated annealing.
Start depth-first search of the solution tree.
While the set of solutions to check is not empty:
    For node i, calculate LB according to Equation (9) and the current criterion value $E_0(r, h)$
    If $E_0(r, h)$ < UB, set UB = $E_0(r, h)$ and update $\lambda_{best}$
    else if LB ≥ UB, cut the node i and go to the next branch as in Figure 2
    else go to node i + 1
Show the best solution $\lambda_{best}$
Algorithm 2: HA1 (remaining steps)
    If $E_0(r, h)$ < crit, set crit = $E_0(r, h)$ and go to node i + 1
    else cut the node i and go to the next branch as in Figure 2
Show the best solution $\lambda_{best}$

Figure 3. Desired response of the system for $\Omega_1$: the dark areas denote the stopband regions, and the lighter area denotes the passband region.

Figure 4. Desired response of the system for $\Omega_2$: the dark areas denote the stopband regions, and the lighter area denotes the passband region.

Algorithm 5: Simulated annealing
Define the objective function $\min_{\lambda \in \Lambda(M),\, h \in \mathbb{R}^{K \times L}} E_\lambda(r, h)$
Calculate the criterion for the fully active array $\lambda_f$; set T, $max_{it}$, and γ; $E_{best} = E(\lambda_f)$
Generate a random initial solution $\lambda_a$; $E_{curr} = E(\lambda_a)$
While (iter < $max_{it}$) or (stop criterion):
    Choose $\lambda_{anew}$ by a random switch of the activeness of two microphones and by negating the activeness of two random microphones
    Assign $\lambda_a = \lambda_{anew}$ with probability $P(T, \lambda_a, \lambda_{anew}) = \min\{1, \exp(-(E(\lambda_{anew}) - E(\lambda_a))/T)\}$
    If $E(\lambda_{anew}) < E_{best}$: $\lambda_{best} = \lambda_{anew}$
    $T = T / (1 + \gamma T)$
Show the best solution $\lambda_{best}$ and $E_{best}$
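The steps of the simulated annealing procedure can be sketched as below. This is a simplified illustration, not the implementation used in the experiments: the neighbourhood move only negates the activeness of two random microphones, the random generator is seeded for repeatability, the initial random solution is also compared with the best one, and the function name, parameter names, and default values are illustrative.

```python
import math
import random

def simulated_annealing(n_mics, criterion, t0=1000.0, gamma=0.9,
                        max_it=500, seed=0):
    """Minimize criterion(bits) over 0/1 lists with at least one active mic."""
    rng = random.Random(seed)
    best = [1] * n_mics                      # start from the fully active array
    e_best = criterion(best)
    current = [rng.randint(0, 1) for _ in range(n_mics)]
    if not any(current):
        current[rng.randrange(n_mics)] = 1   # all-inactive solution is excluded
    e_curr = criterion(current)
    if e_curr < e_best:
        best, e_best = list(current), e_curr
    temp = t0
    for _ in range(max_it):
        cand = list(current)
        for idx in rng.sample(range(n_mics), 2):   # negate two random mics
            cand[idx] = 1 - cand[idx]
        if any(cand):
            e_cand = criterion(cand)
            delta = e_cand - e_curr
            # Acceptance rule: P = min(1, exp(-delta / T))
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, e_curr = cand, e_cand
            if e_cand < e_best:
                best, e_best = list(cand), e_cand
        temp = temp / (1.0 + gamma * temp)   # cooling: T = T / (1 + gamma * T)
    return best, e_best

# Toy criterion in which each microphone changes the value by a fixed amount.
gains = [5.0, -2.0, 4.0, -1.0, 3.0]
toy = lambda bits: 10.0 - sum(g * b for g, b in zip(gains, bits))
best, e_best = simulated_annealing(len(gains), toy)
```

The value returned here plays the role of the initial UB for the branch-and-bound search: the closer it is to the optimum, the earlier branches can be pruned.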

Table 1. The results of the B&B algorithm.